Using Visualization to Support Data Mining of Large Existing Databases
Abstract
In this paper, we present ideas on how visualization technology can be used to improve the difficult process of querying very large databases. With our VisDB system, we try to provide visual support not only for the query specification process, but also for evaluating query results and, thereafter, refining the query accordingly. The main idea of our system is to represent as many data items as possible by the pixels of the display device. By arranging and coloring the pixels according to their relevance for the query, the user gets a visual impression of the resulting data set and of its relevance for the query. Using an interactive query interface, the user may change the query dynamically and receives immediate feedback through the visual representation of the resulting data set. By using multiple windows for different parts of the query, the user gets visual feedback for each part of the query and may therefore understand the overall result more easily. To support complex queries, we introduce the notion of 'approximate joins', which allow the user to find data items that only approximately fulfill join conditions. We also present ideas on how our technique may be extended to support the interoperation of heterogeneous databases. Finally, we discuss the performance problems that are caused by interfacing to existing database systems and present ideas to solve these problems by using data structures supporting a multidimensional search of the database.
Keywords: Visualizing Large Data Sets, Visualizing Multidimensional Multivariate Data, Data Mining, Visual Query Systems, Visual Relevance Feedback, Interfaces to Database Systems
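The pixel-per-item idea described in the abstract can be illustrated with a minimal sketch. It is not the VisDB implementation: the distance function, the spiral arrangement starting at the centre of the window, the grayscale mapping, and all function names (`distance_to_range`, `spiral_positions`, `render_relevance`) are assumptions made for illustration, since the abstract only states that pixels are arranged and colored according to relevance.

```python
from typing import List, Optional, Sequence, Tuple


def distance_to_range(value: float, low: float, high: float) -> float:
    """Return 0 if low <= value <= high, otherwise the gap to the nearest bound."""
    if value < low:
        return low - value
    if value > high:
        return value - high
    return 0.0


def spiral_positions(side: int) -> List[Tuple[int, int]]:
    """Cells (row, col) of a side x side grid, ordered outward from the centre.

    The spiral layout is an assumption of this sketch.
    """
    row = col = side // 2
    cells = [(row, col)]
    d_row, d_col = 0, 1  # start by moving right
    step = 1
    while len(cells) < side * side:
        for _ in range(2):  # two legs share each step length
            for _ in range(step):
                row, col = row + d_row, col + d_col
                if 0 <= row < side and 0 <= col < side:
                    cells.append((row, col))
            d_row, d_col = d_col, -d_row  # turn: right, down, left, up
        step += 1
    return cells


def render_relevance(
    items: Sequence[float], low: float, high: float, side: int
) -> List[List[Optional[int]]]:
    """Place one gray level per item: 255 for exact answers, darker as relevance drops.

    Items are sorted by distance to the query range, so the most relevant
    ones land in the centre. Unused cells stay None; items beyond the
    grid capacity are dropped.
    """
    ranked = sorted(items, key=lambda v: distance_to_range(v, low, high))
    max_dist = max(
        (distance_to_range(v, low, high) for v in ranked), default=0.0
    ) or 1.0  # avoid division by zero when every item matches
    grid: List[List[Optional[int]]] = [[None] * side for _ in range(side)]
    for value, (r, c) in zip(ranked, spiral_positions(side)):
        dist = distance_to_range(value, low, high)
        grid[r][c] = round(255 * (1 - dist / max_dist))
    return grid


if __name__ == "__main__":
    # Hypothetical one-attribute query: 4 <= value <= 6
    for line in render_relevance([5, 1, 9, 4, 6, 20, 0, 5, 3], 4, 6, 3):
        print(line)
```

In this sketch, changing `low` or `high` and re-rendering stands in for the interactive feedback loop the abstract mentions; a multi-attribute query would need one such grid per query part plus a combined distance, which is left out here.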
Related resources
DataJewel: Tightly Integrating Visualization with Temporal Data Mining
In this paper we describe DataJewel, a new architecture designed for temporal data mining. It tightly integrates a visualization component, an algorithmic component and a database component. We introduce a new visualization technique called CalendarView as an implementation of the visualization component. We show how algorithms can be tightly integrated with the visualization component and that...
Design and Test of the Real-time Text mining dashboard for Twitter
One of today's major research trends in the field of information systems is the discovery of implicit knowledge hidden in datasets that are currently being produced at high speed, in large volumes, and in a wide variety of formats. Data with such features is called big data. Extracting, processing, and visualizing this huge amount of data has today become one of the concerns of data science scholar...
Modeling Multidimensional Australian Resources Data for an effective Business Knowledge Management
Historical Australian resources (exploration and production) data are stored in data warehouse environment in the form of relational and hierarchical data structures in multiple dimensions. Significantly, these resources databases consist of periodic dimension, characterizing the role of period and its relation among other data dimensions, their attributes and fact tables. Data mining of period...
Mining Spatial Data Using An Interactive Rule-Based Approach
With the advent of very large spatial databases, it is beyond human capacity to examine and understand the information contained within such volumes of data directly. Although data mining has been recognized as a key means of finding patterns in large databases, general data mining methods alone are not sufficient for spatial data mining. The strengths of the computer’s ability to perform numer...
Using data mining techniques for predicting the survival rate of breast cancer patients: a review article
This review was conducted between December 2018 and March 2019 at Isfahan University of Medical Sciences. A review of various studies revealed what data mining techniques to predict the probability of survival, what risk factors for these predictions, what criteria for evaluating data mining techniques, and finally what data sources for it have been used to predict the surv...
Publication date: 1993